# Multi-image Reasoning

MiniCPM-V 2.6 RK3588 1.1.4

MiniCPM-V 2.6 is a GPT-4V-level multimodal large language model that supports single-image, multi-image, and video understanding, optimized for the RK3588 NPU.

Transformers Other

MiniCPM-V 2.6 is the latest and most powerful multimodal large model in the MiniCPM-V series, supporting single-image, multi-image, and video understanding with leading performance and extreme efficiency.

Transformers Other

MMICL InstructBLIP T5 XXL

MMICL is a multimodal vision-language model built on BLIP-2/InstructBLIP that can analyze and reason over multiple images while following instructions.

Transformers English

© 2025 AIbase